1. Data defining a phoneme and word lattice, the data 
comprising: 

data for defining a plurality of nodes within the 
lattice and a plurality of links connecting the nodes 
within the lattice; 

data associating a plurality of phonemes with a 
respective plurality of links; and 

data associating at least one word with at least one 
of said links. 
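
By way of illustration only, the lattice of claim 1 might be held in memory as sketched below. This is a minimal sketch under stated assumptions: the class names (Link, Node, Lattice), the helper build_example and the phoneme symbols are hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Link:
    """A link to another node, carrying a phoneme and/or a word (hypothetical layout)."""
    target: int                    # index of the node this link leads to
    phoneme: Optional[str] = None  # phoneme associated with the link, if any
    word: Optional[str] = None     # word associated with the link, if any


@dataclass
class Node:
    links: List[Link] = field(default_factory=list)


@dataclass
class Lattice:
    nodes: List[Node] = field(default_factory=list)

    def add_node(self) -> int:
        """Append a node and return its index."""
        self.nodes.append(Node())
        return len(self.nodes) - 1

    def add_link(self, source: int, target: int,
                 phoneme: Optional[str] = None,
                 word: Optional[str] = None) -> None:
        """Connect two nodes and attach a phoneme or a word to the link."""
        self.nodes[source].links.append(Link(target, phoneme, word))


def build_example() -> Lattice:
    """Build a four-node lattice for the word "tell" (assumed phonemes t, eh, l)."""
    lattice = Lattice()
    n0, n1, n2, n3 = (lattice.add_node() for _ in range(4))
    lattice.add_link(n0, n1, phoneme="t")
    lattice.add_link(n1, n2, phoneme="eh")
    lattice.add_link(n2, n3, phoneme="l")
    # The word link spans the same nodes as its three phoneme links.
    lattice.add_link(n0, n3, word="tell")
    return lattice
```

In this sketch node 0 is connected to two other nodes, once by a phoneme link and once by a word link, which is the arrangement recited in claims 12 and 13.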



2. Data according to any preceding claim, wherein said 
data defining said phoneme and word lattice is arranged 
in blocks of nodes. 

3. Data according to claim 1, further comprising data 
defining time stamp information for each of said nodes. 



4. Data according to claim 3, arranged in blocks of 
equal time duration. 
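
As an illustration of claims 3 and 4, time-stamped nodes could be grouped into blocks of equal time duration as sketched below. This is only a sketch: the function name, the use of seconds as the time unit and the dictionary layout are assumptions, not features recited in the claims.

```python
from typing import Dict, List


def group_nodes_into_blocks(node_times: List[float],
                            block_duration: float) -> Dict[int, List[int]]:
    """Map each block number to the indices of the nodes whose time stamps fall in it.

    node_times[i] is the (assumed) time stamp of node i in seconds.
    Block k covers the half-open interval [k * block_duration, (k + 1) * block_duration).
    """
    if block_duration <= 0:
        raise ValueError("block_duration must be positive")
    blocks: Dict[int, List[int]] = {}
    for index, time_stamp in enumerate(node_times):
        block_id = int(time_stamp // block_duration)
        blocks.setdefault(block_id, []).append(index)
    return blocks
```

For example, with one-second blocks, nodes stamped at 0.0 s and 0.4 s fall in block 0 and a node stamped at 1.2 s falls in block 1.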

5. Data according to claim 2, further comprising data 
defining each block's location within said database. 

6. Data according to claim 3, wherein said data 
defining a phoneme and word lattice is associated with 
further data defining a time sequential signal, and 
wherein said time stamp information is time synchronised 
with said time sequential signal. 



7. Data according to claim 6, wherein said further data 
defines an audio and/or video signal. 



8. Data according to claim 7, wherein said further data 
defines at least speech data and wherein said data 
defining said phoneme and word lattice is derived from 
said further data. 




9. Data according to claim 8, wherein said speech data 
comprises audio data and wherein said data defining said 
phoneme and word lattice is derived by passing said audio 
signal through an automatic speech recognition system. 

10. Data according to claim 8, wherein said speech data 
defines the parol of a plurality of speakers, and wherein 
said data defines a separate phoneme and word lattice for 
the parol of each speaker. 

11. Data according to claim 1, further comprising data 
defining weighting for the phonemes and/or words 
associated with said links. 

12. Data according to claim 1, wherein at least one of 
said nodes is connected to a plurality of other nodes by 
a plurality of links. 

13. Data according to claim 12, wherein at least one of 
said plurality of links connecting said node to said 
plurality of other nodes is associated with a phoneme and 
wherein at least one of said links connecting said node 
to said plurality of other nodes is associated with a 
word. 

14. A method of searching a database comprising data 
according to any preceding claim, in response to an input 
query, the method comprising the steps of: 

generating phoneme data and/or word data 
corresponding to the input query; 

searching the phoneme and word lattice using the 
phoneme and/or word data generated for the input query; 
and 

outputting search results in dependence upon the 
results of said searching step. 

15. A method according to claim 14, wherein said 
searching step comprises the steps of: 

(i) searching the phoneme and word lattice using the 
word data generated for the user's input query to 
identify similar words within the phoneme and word 
lattice; 

(ii) selecting one or more portions of the phoneme 
and word lattice for further searching in response to the 
results of said word search; and 

(iii) searching said one or more selected portions 
of the phoneme and word lattice using the phoneme data 
generated for the user's input query. 
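
The three steps of claim 15 can be sketched as follows. This is a simplified illustration, not the claimed implementation: each lattice portion is flattened to a word list and a single phoneme sequence, "similar words" is reduced to a case-insensitive exact match, and the phoneme search is a plain contiguous-subsequence test. The names Portion, contains_subsequence and two_stage_search are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Portion:
    """One portion of the annotation data, flattened for this sketch (hypothetical)."""
    portion_id: str
    words: List[str]
    phonemes: List[str]


def contains_subsequence(haystack: Sequence[str], needle: Sequence[str]) -> bool:
    """Return True if needle occurs as a contiguous run inside haystack."""
    n = len(needle)
    if n == 0:
        return False  # an empty phoneme query matches nothing in this sketch
    return any(list(haystack[i:i + n]) == list(needle)
               for i in range(len(haystack) - n + 1))


def two_stage_search(portions: List[Portion],
                     query_words: List[str],
                     query_phonemes: List[str]) -> List[str]:
    """Word search first, then a phoneme search restricted to the selected portions."""
    wanted = {w.lower() for w in query_words}
    # Steps (i) and (ii): keep only the portions containing a query word.
    selected = [p for p in portions if wanted & {w.lower() for w in p.words}]
    # Step (iii): phoneme search limited to the selected portions.
    return [p.portion_id for p in selected
            if contains_subsequence(p.phonemes, query_phonemes)]
```

Restricting the phoneme search to the portions picked out by the word search keeps the more expensive comparison away from most of the database, which appears to be the purpose of the ordering in steps (i) to (iii).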



16. A method according to claim 15, wherein the results 
of the word search are output to the user before the 
phoneme search is performed on the selected portions of 
the database. 

17. A method according to claim 16, wherein said phoneme 
search is only performed in response to a further input 
by the user in response to the outputting of the results 
from the word search. 



18. A method according to claim 15, wherein said phoneme 
search is carried out by identifying a number of features 
within the phoneme sequence corresponding to the user's 
input query and identifying similar features within the 
data defining said phoneme lattice within the database. 

19. A method according to claim 18, wherein each of said 
features represents a unique sequence of phonemes within 
the phoneme data of the user's input query. 

20. A method according to claim 19, wherein said phoneme 
search employs a cosine measure to indicate the 
similarity between the phoneme data corresponding to the 
user's input query and the phoneme data within the 
database. 
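
One possible reading of claims 18 to 20 is sketched below: the features are taken to be phoneme n-grams and the cosine measure is computed between n-gram count vectors. The choice of n-grams, the default n = 3 and the function names are assumptions made for illustration; the claims only require features that are unique sequences of phonemes and a cosine measure.

```python
import math
from collections import Counter
from typing import Sequence


def phoneme_ngrams(phonemes: Sequence[str], n: int = 3) -> Counter:
    """Count every run of n consecutive phonemes (the assumed features)."""
    return Counter(tuple(phonemes[i:i + n])
                   for i in range(len(phonemes) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two feature-count vectors; 0.0 if either is empty."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A query and a database sequence that share every feature in the same proportions score close to 1, and sequences with no feature in common score 0, so the value can be used to rank candidate portions.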

21. A method according to claim 14, wherein said search 
results are output to a display. 



22. A method according to claim 14, wherein said input 
query by the user is input by voice, and wherein said 
step of generating phoneme data and word data employs an 
automatic speech recognition system. 

23. A method according to claim 14, wherein said input 
query is a typed input and wherein said step of 
generating phoneme data and word data employs a 
text-to-phoneme converter. 



24. An apparatus for searching a database comprising 
data according to claim 1, in response to an input query, 
the apparatus comprising: 

means for generating phoneme data and/or word data 
corresponding to the input query; 

means for searching the phoneme and word lattice 
using the phoneme and/or word data generated for the 
input query; and 

means for outputting search results in dependence 
upon the output from said searching means. 



25. An apparatus according to claim 24, wherein said 
searching means comprises: 



(i) means for searching the phoneme and word lattice 
using the word data generated for the user's input query 
to identify similar words within the phoneme and word 
lattice; 

(ii) means for selecting one or more portions of the 
phoneme and word lattice for further searching in 
response to the results of said word search; and 

(iii) means for searching said one or more selected 
portions of the phoneme and word lattice using the 
phoneme data generated for the user's input query. 

26. An apparatus according to claim 25, wherein said 
output means is operable to output the results of the 
word search to the user before the phoneme search is 
performed on the selected portions of the database. 



27. An apparatus according to claim 26, wherein said 
phoneme search is only performed in response to a further 
input by the user in response to the outputting of the 
results from the word search. 

28. An apparatus according to claim 25, wherein said 
phoneme search is carried out by identifying a number of 
features within the phoneme sequence corresponding to the 
user's input query and identifying similar features 
within the data defining said phoneme lattice within the 
database. 

29. An apparatus according to claim 28, wherein each of 
said features represents a unique sequence of phonemes 
within the phoneme data of the user's input query. 

30. An apparatus according to claim 29, wherein said 
phoneme search employs a cosine measure to indicate the 
similarity between the phoneme data corresponding to the 
user's input query and the phoneme data within the 
database. 



31. An apparatus according to claim 24, wherein said 
output means comprises a display. 

32. An apparatus according to claim 24, wherein said 
input query by the user is a voice query, and wherein 
said means for generating phoneme data and word data 
comprises an automatic speech recognition system which is 
operable to generate said phoneme data and a word decoder 
which is operable to generate said word data. 

33. An apparatus according to claim 24, wherein said 
input query is a typed query and wherein said means for 
generating phoneme data and word data comprises a 
text-to-phoneme converter which is operable to generate 
said phoneme data. 

34. An apparatus for generating annotation data for use 
in annotating a data file comprising audio data, the 
apparatus comprising: 

an automatic speech recognition system for 
generating phoneme data for audio data in the data file; 

a word decoder for identifying possible words within 
the phoneme data generated by the automatic speech 
recognition system; and 

generating means for generating annotation data by 
combining the generated phoneme data and the decoded 
words. 

35. An apparatus for generating annotation data for use 
in annotating a data file comprising text data, the 
apparatus comprising: 

a text to phoneme converter for generating phoneme 
data for text data in the data file; and 

generating means for generating annotation data by 
combining the phoneme data and words in the text data. 
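
The combination of phoneme data and words recited in claim 35 might look like the sketch below. It assumes a toy pronunciation dictionary (TOY_LEXICON) standing in for the text to phoneme converter; the dictionary contents, the function name annotate_text and the output layout are all hypothetical.

```python
from typing import Dict, List

# Hypothetical pronunciation dictionary used in place of a real converter.
TOY_LEXICON: Dict[str, List[str]] = {
    "data": ["d", "ey", "t", "ax"],
    "file": ["f", "ay", "l"],
}


def annotate_text(text: str, lexicon: Dict[str, List[str]]) -> List[dict]:
    """Pair each word of the text with its phonemes to form annotation data.

    Words missing from the lexicon get an empty phoneme list in this sketch;
    a real converter would be expected to produce a transcription instead.
    """
    annotation = []
    for word in text.lower().split():
        annotation.append({"word": word, "phonemes": list(lexicon.get(word, []))})
    return annotation
```

Keeping the word and its phonemes side by side means a later search can match on either, which is the reason for combining the two kinds of data in the annotation.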
36. An apparatus for generating annotation data for use 
in annotating a data file, the apparatus comprising: 

input means for receiving an input voice signal; 

speech recognition means for converting the input 
voice signal into phoneme data and words; and 

generating means for generating annotation data by 
combining the phoneme data and the words . 

37. An apparatus for generating annotation data for use 
in annotating a data file, the apparatus comprising: 

input means for receiving a typed input from a user; 

converting means for converting words in the typed 
input into phoneme data; and 

generating means for generating annotation data by 
combining the phoneme data and words in the typed input. 

38. An apparatus for generating annotation data for use 
in annotating a data file, the apparatus comprising: 

means for receiving image data representative of 
text; 

character recognition means for converting said 
image data into text data; 

converting means for converting words in the text 
data into phoneme data; and 

generating means for generating annotation data by 
combining the phoneme data and words in the text data. 

39. An apparatus according to claim 34, wherein said 
annotation data defines a phoneme and word lattice and 
wherein said generating means comprises: 

(i) means for generating data defining a plurality 
of nodes within the lattice and a plurality of links 
connecting the nodes within the lattice; 

(ii) means for generating data associating a 
plurality of phonemes of the phoneme data with a 
respective plurality of links; and 

(iii) means for generating data associating at least 
one of the words with at least one of said links. 



40. An apparatus according to claim 39, wherein said 
generating means is operable to generate said data 
defining said phoneme and word lattice in blocks of said 
nodes. 

41. An apparatus according to claim 39, wherein said 
generating means is operable to generate data defining 
time stamp information for each of said nodes. 

42. An apparatus according to claim 41, wherein said 
generating means is arranged to generate said phoneme and 
word lattice data in blocks of equal time duration. 

43. An apparatus according to claim 40, wherein said 
generating means is operable to generate data which 
defines each block's location within a database. 



44. An apparatus according to claim 41, wherein said 
data file includes a time sequential signal, and wherein 
said generating means is operable to generate time stamp 
data which is time synchronised with said time sequential 
signal. 

45. An apparatus according to claim 44, wherein said 
time sequential signal is an audio and/or video signal. 



46. An apparatus according to claim 34, wherein said 
audio data includes audio data which defines the parol of 
a plurality of speakers, and wherein said generating 
means is operable to generate data which defines separate 
phoneme and word annotation data for the parol of each 
speaker. 

47. An apparatus according to claim 35, wherein said 
text data defines the parol of a plurality of speakers, 
and wherein said generating means is operable to generate 
data defining separate phoneme and word annotation data 
for the parol of each speaker. 

48. An apparatus according to claim 34, wherein said 
speech recognition system is operable to generate data 
defining a weighting for the phonemes in the phoneme 
data. 

49. An apparatus according to claim 34, wherein said 
word decoder is operable to generate data defining a 
weighting for the words identified within said phoneme 
data. 

50. An apparatus according to claim 39, wherein said 
means for generating data defining a plurality of nodes 
and a plurality of links is operable to define at least 
one node which is connected to a plurality of other nodes 
by a plurality of links. 

51. An apparatus according to claim 50, wherein at least 
one of said plurality of links connecting said node to 
said plurality of other nodes is associated with a 
phoneme and wherein at least one of said links connecting 
said node to said plurality of other nodes is associated 
with a word. 

52. An apparatus according to claim 36, wherein said 
speech recognition means is operable to generate data 
defining a weighting for the phonemes in the phoneme 
data. 

53. An apparatus according to claim 52, wherein said 
speech recognition means is operable to generate data 
defining a weighting for the words within the word data. 

54. An apparatus according to claim 36, further 
comprising means for associating said annotation data 
with said data file. 



55. An apparatus according to claim 37, wherein said 
converting means comprises an automatic phonetic 
transcription unit which generates said phoneme data from 
words within the typed input. 

56. An apparatus according to claim 38, wherein said 
converting means comprises an automatic phonetic 
transcription unit which generates said phoneme data from 
words within the text data output by said character 
recognition means. 

57. An apparatus according to claim 38, further 
comprising means for associating said annotation data 
with either said image data representative of said text 
or with said text data. 

58. An apparatus according to claim 38, wherein said 
receiving means comprises a document scanner or a 
facsimile machine. 

59. A method of generating annotation data for use in 
annotating a data file comprising audio data, the method 
comprising the steps of: 

using an automatic speech recognition system to 
generate phoneme data for audio data in the data file; 

using a word decoder to identify possible words 
within the phoneme data generated by the automatic speech 
recognition system; and 

generating annotation data by combining the 
generated phoneme data and the decoded words. 

60. A method of generating annotation data for use in 
annotating a data file comprising text data, the method 
comprising the steps of: 

using a text to phoneme converter to generate 
phoneme data for text data in the data file; and 

generating annotation data by combining the phoneme 
data and words in the text data. 

61. A method of generating annotation data for use in 
annotating a data file, the method comprising the steps 
of: 

receiving an input voice signal; 

processing the input voice signal using a speech 
recognition system to generate phoneme data and word data 
for the input voice signal; and 

generating annotation data by combining the phoneme 
data and the word data generated for the input voice 
signal. 

62. A method of generating annotation data for use in 
annotating a data file, the method comprising the steps 
of: 

receiving a typed input; 

converting words in the typed input into phoneme 
data; and 

generating annotation data by combining the phoneme 
data and words in the typed input. 

63. A method of generating annotation data for use in 
annotating a data file, the method comprising the steps 
of: 

receiving image data representative of text; 

converting said image data into text data using a 
character recognition unit; 

converting words in the text data into phoneme data; 
and 

generating annotation data by combining the phoneme 
data and words within the text data. 



64. A method according to claim 59, wherein said 
annotation data defines a phoneme and word lattice and 
wherein said generating step comprises the steps of: 

(i) generating data defining a plurality of nodes 
within the lattice and a plurality of links connecting 
the nodes within the lattice; 

(ii) generating data associating a plurality of 
phonemes of the phoneme data with a respective plurality 
of links; and 

(iii) generating data associating at least one of 
the words with at least one of said links. 

65. A method according to claim 64, wherein said 
generating step generates said data defining said phoneme 
and word lattice in blocks of said nodes. 

66. A method according to claim 64, wherein said 
generating step generates data defining time stamp 
information for each of said nodes. 



67. A method according to claim 66, wherein said 
generating step generates said phoneme and word lattice 
data in blocks of equal time duration. 



68. A method according to claim 65, wherein said 
generating step generates data which defines each block's 
location within a database. 



69. A method according to claim 66, wherein said data 
file includes a time sequential signal, and wherein said 
generating step generates time stamp data which is time 
synchronised with said time sequential signal. 

70. A method according to claim 69, wherein said time 
sequential signal is an audio and/or video signal. 

71. A method according to claim 59, wherein said audio 
data includes audio data which defines the parol of a 
plurality of speakers, and wherein said generating step 
generates data which defines separate phoneme and word 
annotation data for the parol of each speaker. 

72. A method according to claim 60, wherein said text 
data defines the parol of a plurality of speakers, and 
wherein said generating step generates data defining 
separate phoneme and word annotation data for the parol 
of each speaker. 

73. A method according to claim 59, wherein said speech 
recognition system generates data defining a weighting 
for the phonemes associated with said links. 



74. A method according to claim 59, wherein said word 
decoder generates data defining a weighting for the words 
associated with said links. 

75. A method according to claim 64, wherein said step of 
defining a plurality of nodes and a plurality of links 
defines at least one node which is connected to a 
plurality of other nodes by a plurality of links. 

76. A method according to claim 75, wherein at least one 
of said plurality of links connecting said node to said 
plurality of other nodes is associated with a phoneme and 
wherein at least one of said links connecting said node 
to said plurality of other nodes is associated with a 
word. 

77. A method according to claim 61, wherein said speech 
recognition system generates data defining a weighting 
for the phonemes associated with said links. 

78. A method according to claim 61, wherein said speech 
recognition system generates data defining a weighting 
for the words associated with said links. 

79. A method according to claim 61, further comprising 
the step of associating said annotation data with said 
data file. 

80. A method according to claim 62, wherein said 
converting step uses an automatic phonetic transcription 
unit which generates said phoneme data for words within 
the typed input. 

81. A method according to claim 63, wherein said step of 
converting words into phonemes uses an automatic phonetic 
transcription unit which generates said phoneme data for 
words within the text data output by said character 
recognition unit. 

82. A method according to claim 63, further comprising 
the step of associating said annotation data with either 
said received image data or with said text data. 

83. A method according to claim 63, wherein said 
receiving step uses a document scanner or a facsimile 
machine. 



84. A method of searching a data file including 
annotation data in response to an input query, the method 
comprising the steps of: 

generating phoneme data and word data corresponding 
to the input query; 

searching the data file based on the phoneme data 
and/or the word data and the annotation data; and 

outputting search results in dependence upon the 
result of said searching step. 

85. A method according to claim 84, wherein said 
annotation data defines a phoneme and word lattice 
comprising: 

(i) data for defining a plurality of nodes within 
the lattice and a plurality of links connecting the nodes 
within the lattice; 

(ii) data for associating a plurality of phonemes of 
the phoneme data with a respective plurality of links; 
and 

(iii) data for associating at least one word with at 
least one of said links. 

86. A method for storing a data file into a database, 
the method comprising the steps of: 

combining the data file with annotation data 
corresponding to the data file, the annotation data 
including phoneme data; and 

storing the data file with the annotation data. 



87. An apparatus for searching a data file including 
annotation data, in response to an input query, the 
apparatus comprising: 

means for generating phoneme data and word data 
corresponding to the input query; 

means for searching a data file based on the phoneme 
data and/or the word data and the annotation data; and 

means for outputting a search result in dependence 
upon the result of said searching means. 

88. An apparatus according to claim 87, wherein said 
annotation data defines a phoneme and word lattice, and 
comprises: 

(i) data defining a plurality of nodes within the 
lattice and a plurality of links connecting the nodes 
within the lattice; 

(ii) data associating a plurality of phonemes of the 
phoneme data with a respective plurality of links; and 

(iii) data associating at least one word with at 
least one of said links. 

89. An apparatus for storing a data file into a 
database, the apparatus comprising: 

means for inputting the data file and annotation 
data corresponding to the data file, the annotation data 
including phoneme data; and 

means for storing the data file with the annotation 
data. 



90. A medium for storing a data file, the data file 
comprising: 

audio data; and 

annotation data corresponding to the audio data, 
said annotation data including phoneme data. 

91. A medium for storing a data file, the data file 
comprising: 

video data; 

audio data corresponding to the video data; and 

annotation data corresponding to the audio data, the 
annotation data including phoneme data. 

92. A medium for storing a data file, the data file 
comprising: 

text data; and 

annotation data corresponding to the text data, said 
annotation data including phoneme data. 



93. Data including audio data and further comprising 
annotation data corresponding to the audio data, which 
annotation data includes phoneme data. 

94. Data including video data and further comprising 
audio data corresponding to the video data and annotation 
data corresponding to the audio data, which annotation 
data includes phoneme data. 

95. Data including text data, the data further 
comprising annotation data corresponding to the text 
data, which annotation data includes phoneme data. 



96. A data carrier carrying data according to claim 1 or 
processor implementable instructions for controlling a 
processor to implement the method of claim 14 or 59 or 
84. 

97. Processor implementable instructions for controlling 
a processor to implement the method of claim 14 or 59 or 
84. 